Abstract
Introduction
Detection of minimal residual disease (MRD) by flow cytometry (FC) has been shown to be an important prognostic indicator in B-cell acute lymphoblastic leukemia (B-ALL). However, current analysis relies on manual gating by reviewers, so objectivity can be compromised and decentralized B-ALL MRD detection is difficult to establish. We therefore aimed to develop a diagnostic tool that uses artificial intelligence (AI) techniques to interpret FC data, in order to provide efficient and objective B-ALL MRD detection.
Methods
Retrospective FC data from adult patients with B-ALL treated at National Taiwan University Hospital from 2009 to 2016 were included. A total of 1,073 unique bone marrow FC samples were collected for analysis. Each sample underwent a 10-tube test on 100,000 cells; within each tube, every cell was measured in 6 channels (the scatter parameters FSC and SSC, and the fluorescence channels FITC, PE, PerCP, and APC).
The flow conclusion of each FC sample was categorized as "positive MRD" or "no MRD" according to the previous manual interpretation, and the whole dataset was randomly divided into 3 groups: Training set (n=718), Accuracy set (n=141) and Validation set (n=178). The Training set was used to train the algorithm, and the Accuracy set was used to evaluate the accuracy achieved and to optimize algorithmic parameters. The final optimized algorithm was then evaluated blindly on the Validation set, using the concordance rate between the manual and the AI conclusions in distinguishing "positive MRD" from "no MRD".
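As an illustration of the random three-way split described above, the following is a minimal sketch in Python. The function name `split_dataset`, the fixed seed, and the use of integer sample IDs are assumptions made for illustration; the abstract does not describe how the randomization was implemented. Only the three set sizes are taken from the text.

```python
import random

def split_dataset(sample_ids, sizes, seed=0):
    """Randomly partition sample IDs into named, non-overlapping subsets.

    sizes maps a subset name to the number of samples it should receive.
    """
    ids = list(sample_ids)
    if sum(sizes.values()) > len(ids):
        raise ValueError("requested subset sizes exceed the number of samples")
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    rng.shuffle(ids)
    subsets, start = {}, 0
    for name, n in sizes.items():
        subsets[name] = ids[start:start + n]
        start += n
    return subsets

# Set sizes as reported in the abstract; the IDs are placeholder integers.
SIZES = {"training": 718, "accuracy": 141, "validation": 178}
splits = split_dataset(range(sum(SIZES.values())), SIZES)
```

Keeping the Validation set separate until all parameters are fixed is what allows the final evaluation to be blind.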
For algorithm development, the recorded numerical values of the 6 channels (FSC, SSC, FITC, PE, PerCP, APC) of each tube (100,000 cells) were used as raw feature attributes. The probability distributions of these attributes were first modeled as a multivariate Gaussian mixture model using a sub-dictionary learning approach. A probabilistic derivation was then used to compute a per-sample, L2-normalized vectorized representation. Lastly, the representations of all tubes were concatenated to form the final feature input to a supervised machine learning classifier, in this case a support vector machine with a linear kernel. In addition, ANOVA-based feature selection was applied throughout the experiments.
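The encoding step can be illustrated with a minimal Python sketch. It assumes a Fisher-vector-style encoding (the gradient with respect to the mixture means), which is one common way to derive a fixed-length vector from a Gaussian mixture model; the abstract does not name the exact derivation, so this is not necessarily the authors' method. The sketch also assumes diagonal covariances and already-fitted mixture parameters, and all function names are hypothetical. Feature selection and the linear SVM are omitted.

```python
import math

def gaussian_logpdf_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given one cell x."""
    logs = [math.log(w) + gaussian_logpdf_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def encode_tube(cells, weights, means, variances):
    """Fisher-vector-style mean-gradient encoding of one tube, L2-normalized."""
    k, d = len(weights), len(means[0])
    acc = [[0.0] * d for _ in range(k)]
    for x in cells:
        gamma = responsibilities(x, weights, means, variances)
        for j in range(k):
            for i in range(d):
                acc[j][i] += gamma[j] * (x[i] - means[j][i]) / math.sqrt(variances[j][i])
    n = len(cells)
    vec = [acc[j][i] / (n * math.sqrt(weights[j]))
           for j in range(k) for i in range(d)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def encode_sample(tubes, gmms):
    """Concatenate the per-tube encodings into one feature vector per sample.

    tubes: list of cell lists, one per tube.
    gmms: list of (weights, means, variances) tuples, one per tube.
    """
    features = []
    for cells, (weights, means, variances) in zip(tubes, gmms):
        features.extend(encode_tube(cells, weights, means, variances))
    return features
```

With K mixture components, D channels and T tubes, this sketch yields a vector of length K × D × T for each sample regardless of how many cells were acquired, which is what makes a linear classifier applicable.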
For outcome correlation analysis, we included 35 adult B-ALL patients with available FC data after induction therapy, recorded their clinical parameters, and measured their overall survival (OS) and progression-free survival (PFS), with a median follow-up of 40.0 months. Kaplan-Meier curves were constructed to estimate OS and RFS.
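For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of Python. The function names and the toy follow-up data below are hypothetical and are not taken from the study; an event flag of 1 marks an observed event and 0 marks a censored observation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed  # events and censored subjects both leave the risk set
    return curve

def median_survival(curve):
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy data (months of follow-up, event flag), for illustration only.
toy_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
toy_median = median_survival(toy_curve)
```

A median that is reported as "NR" corresponds to the `None` case here: the estimated survival curve never falls to 50% during follow-up.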
Results
The concordance of the primary algorithm, estimated on the Accuracy set, was 90.8%, and the area under the receiver operating characteristic curve (AUC) was 0.944 (Table 1). The calibrated algorithm, tuned on the Accuracy set, achieved a concordance rate of 88.2% with manual interpretation on the Validation set, with an AUC of 0.919 (Table 1).
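The two metrics reported above can be computed as in the following minimal Python sketch. Concordance is taken here to mean the fraction of samples on which the AI call matches the manual call, and AUC is computed from its rank interpretation (the probability that a positive sample scores higher than a negative one, with ties counted as one half). The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration, not study data.

```python
def concordance_rate(manual, predicted):
    """Fraction of samples where the predicted call equals the manual call."""
    if len(manual) != len(predicted):
        raise ValueError("label lists must have the same length")
    agree = sum(1 for m, p in zip(manual, predicted) if m == p)
    return agree / len(manual)

def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = "positive MRD", 0 = "no MRD" by manual interpretation.
manual_labels = [1, 1, 0, 0, 1, 0]
ai_scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
ai_calls = [1 if s >= 0.5 else 0 for s in ai_scores]  # threshold is an assumption
toy_concordance = concordance_rate(manual_labels, ai_calls)
toy_auc = roc_auc(manual_labels, ai_scores)
```

Concordance depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures are reported together.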
Notably, the AI algorithm took only 7 seconds to analyze one FC sample, whereas manual gating by trained professionals took approximately 15 to 30 minutes. Furthermore, the concordance of the algorithm developed from 10-tube data remained almost unchanged when the number of tubes was decreased. In fact, the algorithm trained on a single tube still yielded an AUC of 0.874 (vs. 0.919 from 12 tubes), indicating that the AI algorithm may reduce the laboratory testing currently required in standard FC processing.
For outcome correlation, B-ALL patients with no MRD by the AI algorithm had significantly longer OS than those with positive MRD (NR vs 20.0 months, p=0.0051). Moreover, patients with no MRD also had significantly longer RFS than those with positive MRD (8.9 vs 5.2 months, p=0.019) (Figure 1).
Conclusions
This is the first study to use an AI approach to develop FC data diagnostic algorithms with prognostic implications for B-ALL, demonstrating the advantages of reliability, high efficacy and time saving. Incorporating other test results into the AI algorithm may lead to more precise prognostic prediction in the future and could potentially guide therapeutic strategy for B-ALL.
Ko: Celgene International Sàrl: Research Funding. Li: Celgene International Sàrl: Research Funding. Tien: Celgene International Sàrl: Research Funding. Tang: Celgene International Sàrl: Research Funding.
Author notes
Asterisk with author names denotes non-ASH members.